Abstract: Quantitative characterization of randomly roving
agents in wireless sensor networks (WSN) is studied. By
simplifying the formulas known from earlier results and
publications, it is shown that the basic agent model is
probabilistically equivalent to a similar, simpler model, and a
formula for the frequencies is then obtained in terms of Stirling
numbers of the second kind. Stirling numbers are well studied,
and various estimates are known for them, which makes it possible
to justify the quantitative characteristics of roving agents.

I. Introduction 

This work, inspired by [3], [5], [1], considers the numerical
characterization of roving agents, a challenging topic for ad hoc,
pervasive, and trustworthy networks. Agents are autonomous, mobile,
and intelligent software structures capable of playing a sensitive
role in advanced monitoring, computation, and protection systems.
Intrusion detection systems (IDS) [3] are addressed in particular.
They serve as a complementary means to the ordinary cryptographic
protection tools of computers and networks. Such IDS use
software-agent-based monitoring and data collection: watching the
internal processes of a computer, registering log files of
application software systems, and sniffing and recording
communication protocols. By watching the behavior of the whole
network they are better suited to warn of approaching attacks and
malfunctions. Data mining agents (DMA) and data fusion agents (DFA)
are examples of information integration tools in networks [5]. In
large networks, especially when the structure is not predefined, as
in wireless sensor networks [1], it is natural to consider
independent, randomly roving agents and to require that in total
they collect enough information to mine the necessary knowledge
about an intrusion. This framework is studied in [5], which proves
formulas for the number of DMA sufficient to monitor network areas
of a given size. The formulas obtained are complex and impractical
because they use nested sums over different parameters. Our work
aims to prove simple estimates for the same numerical
characteristics of WSN.

II. Roving Agents Model 

A DMA roams randomly around a network and acquires
environmental information. It is lightweight and uses the simplest
mining algorithms. A DFA integrates the actions of a set of DMA.
A DFA may act as an intrusion detection tool, and then its power
depends on the information collected by the DMA in the network.



Let a network $N$ of $n$ nodes $v_1, \dots, v_n$ be given. Some
fixed amount of information $d_i$ is allocated at node $v_i$. There
are $k$ DMA $a_1, \dots, a_k$. Each agent visits exactly $m$
different nodes and obtains the unique information content of each
such node. The DMA pass all collected information to the DFA. Denote
by $P_k(n,m,t)$ the probability that the DFA contains exactly $t$
information blocks of network nodes when $k$ agents randomly
visit $m$ of $n$ nodes each. The formula for $P_k(n,m,t)$ proven
in [5] is:
Formulas for smaller $k$ given in [5] look similar to (1). Of
course, these formulas are unwieldy, so simplifications or
approximations are of interest. For the same reason, [5]
complements its formulas with computer simulation to understand the
typical numbers of agents necessary to retrieve the required
information in a network. Modifications of the "exactly $t$"
condition in the agent distribution scheme are also important to
consider.
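
A simulation of this kind can be sketched as follows. This is a minimal Python sketch under our own assumptions, not the procedure of [5]: the function names `simulate_union_sizes` and `smallest_k`, the trial counts, and the 0.9 confidence level are hypothetical choices made for illustration only.

```python
import random
from collections import Counter

def simulate_union_sizes(n, m, k, trials=20000, seed=0):
    """Estimate the distribution of the number of distinct nodes seen
    when k agents each visit m different nodes out of n (assumed model)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        covered = set()
        for _ in range(k):
            # One agent: a uniformly random m-subset of the n nodes.
            covered.update(rng.sample(range(n), m))
        counts[len(covered)] += 1
    return {t: c / trials for t, c in sorted(counts.items())}

def smallest_k(n, m, target, confidence=0.9, trials=2000, seed=0, k_max=200):
    """Smallest k for which the estimated probability of covering at least
    `target` nodes reaches `confidence`; None if not reached by k_max."""
    for k in range(1, k_max + 1):
        dist = simulate_union_sizes(n, m, k, trials, seed)
        if sum(p for t, p in dist.items() if t >= target) >= confidence:
            return k
    return None

if __name__ == "__main__":
    print(simulate_union_sizes(n=20, m=4, k=3, trials=5000))
    print(smallest_k(n=20, m=4, target=15))
```

The estimates depend on the seed and on the number of trials, so such a run only indicates typical agent numbers; it does not replace the exact formulas.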

III. Coverage Characterization of Roving Agents 

Let the set $N = \{v_1, \dots, v_n\}$ of nodes be given, and let
$S_1, \dots, S_k$ be $k$ arbitrary subsets of $N$, each of size
$m < n$, visited by the $k$ agents respectively. We consider
a probability distribution scheme over $N$ and suppose
that the $m$-subsets $S_i$ are equiprobable and independent in this
scheme. Since there are $C_n^m$ $m$-subsets in total, the
probability of each of them equals $1/C_n^m$. We are interested in
the probabilistic characteristics of the union
$\bigcup_{i=1}^{k} S_i$ and its size.
Fig. 1. Agent sets distribution in terms of trials and node sets. The left column
contains outcomes of $k$ by $m$ trials (each $T_i$ is an ordered collection of $k$
$m$-subsets). The right column contains all the subsets of the node set $N$.



In particular, what is the probability that the union of those
subsets contains exactly $t$ elements?

$$P_k(n,m,t) = \Pr\left\{ \left| \bigcup_{i=1}^{k} S_i \right| = t \right\}. \qquad (2)$$
To a collection of subsets $S_1, \dots, S_k$ of the node set $N$
corresponds a matrix $A^{k \times n} = \{a_{ij}\}$, where

$$a_{ij} = \begin{cases} 1 & \text{if } v_j \in S_i, \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$



As each $S_i$ contains exactly $m$ elements, each row of
$A^{k \times n}$ contains $m$ ones and $n - m$ zeros. If
$\left| \bigcup_{i=1}^{k} S_i \right| = t$, then there are $t$
columns of $A$ which contain at least one 1 and $n - t$ columns
which contain no 1. The number of $k \times n$ matrices with $m$
ones in each row and with exactly $n - t$ columns with no 1 is
$C_n^t \cdot Q(k,m,t)$, where $Q(k,m,t)$ is the number of
$k \times t$ matrices with $m$ ones in each row and at least one 1
in each column.
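
For small parameters this count can be checked by direct enumeration. The sketch below is illustrative only; the helper names (`incidence_matrix`, `covered_columns`, `count_by_union_size`, `q_brute_force`) are our own assumptions. It builds the incidence matrix of each arrangement, groups arrangements by the number of covered columns, and checks the factorization $C_n^t \cdot Q(k,m,t)$.

```python
from itertools import combinations, product
from math import comb

def incidence_matrix(subsets, n):
    """Row i has a 1 in column j exactly when node j belongs to subset S_i."""
    return [[1 if j in s else 0 for j in range(n)] for s in subsets]

def covered_columns(matrix):
    """Number of columns containing at least one 1, i.e. the union size."""
    return sum(1 for column in zip(*matrix) if any(column))

def count_by_union_size(n, m, k):
    """Brute force: number of k x n matrices with m ones per row,
    grouped by the number t of non-empty columns."""
    rows = [frozenset(c) for c in combinations(range(n), m)]
    counts = {}
    for arrangement in product(rows, repeat=k):
        t = covered_columns(incidence_matrix(arrangement, n))
        counts[t] = counts.get(t, 0) + 1
    return counts

def q_brute_force(k, m, t):
    """Q(k, m, t): k x t matrices, m ones per row, no empty column."""
    return count_by_union_size(t, m, k).get(t, 0)

if __name__ == "__main__":
    n, m, k = 4, 2, 2
    counts = count_by_union_size(n, m, k)
    for t, c in sorted(counts.items()):
        # Choose which t columns are non-empty, then fill them with no gaps.
        assert c == comb(n, t) * q_brute_force(k, m, t)
    print(counts)
```

The enumeration grows as $(C_n^m)^k$, so it is only practical as a sanity check for very small $n$, $m$, and $k$.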

Alternatively, let us consider the following schematic
presentation of the distribution of roving agents. The left column
vertices in the scheme presented in Fig. 1 contain all the
arrangements $T_1, T_2, \dots$ of $k$ agents roving over the
$C_n^m$ $m$-node subsets (ordered collections of $k$ $m$-node
subsets). From the combinatorial perspective, agents and nodes are
distinguishable, but $m$-node subsets are considered as usual sets:
different elements and no ordering. The total number of
arrangements is equal to $(C_n^m)^k$. Part of these arrangements
cover exactly $t$ nodes; let these be the vertices
$T_1, T_2, \dots, T_p$. In this notation $p$ is the unknown number
that we want to compute. The right column vertices correspond to
all subsets of the node set $N$, and part of these sets are of size
$t$. In principle, node subset sizes may vary from 0 to $n$, but in
our experiment the size may take values from $m$ to $\min(km, n)$.

We draw an edge between an arrangement and the node subset
which is covered by that arrangement. Each arrangement
is incident to exactly one edge (and subset). Each $t$-subset
is covered by several different arrangements; their number is
common for all $t$-subsets and is given by $Q(k,m,t)$.



$Q(k,m,t)$ can be calculated by the inclusion-exclusion
principle. We use the matrix model for arrangements. First, over
a $k \times t$ matrix we take the whole set of unconstrained
arrangements, that is, all matrices with $m$ ones in each row; then
we remove from this set all the arrangements where at least one
column is entirely filled with 0s (such matrices do not obey
the conditions we require), then add back the arrangements with at
least 2 empty columns, etc. The formula representation of the
related quantities is:
Theorem 1.

First of all, here we obtain a real simplification of (1). The
formula obtained is still complex, but it can be approximated,
and applying the Markov inequality may give asymptotic
estimates of $t$-subset probabilities [4].
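
The counting argument of this section can be turned into a short exact computation. The sketch below is hedged: it assumes that the inclusion-exclusion sum takes the form $Q(k,m,t) = \sum_{i=0}^{t-m} (-1)^i C_t^i (C_{t-i}^m)^k$ and that the probability is $C_n^t \, Q(k,m,t) / (C_n^m)^k$, as follows from the matrix count and the total of $(C_n^m)^k$ arrangements. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def q_count(k, m, t):
    """Assumed inclusion-exclusion form of Q(k, m, t): remove matrices
    with at least one empty column, add back those with two, and so on."""
    return sum((-1) ** i * comb(t, i) * comb(t - i, m) ** k
               for i in range(t - m + 1))

def coverage_probability(n, m, k, t):
    """Probability that k random m-subsets of n nodes cover exactly t nodes."""
    if t < m or t > min(k * m, n):
        return Fraction(0)
    return Fraction(comb(n, t) * q_count(k, m, t), comb(n, m) ** k)

if __name__ == "__main__":
    n, m, k = 10, 3, 4
    dist = {t: coverage_probability(n, m, k, t)
            for t in range(m, min(k * m, n) + 1)}
    assert sum(dist.values()) == 1  # the probabilities form a distribution
    for t, p in dist.items():
        print(t, float(p))
```

Exact rational arithmetic is used so that the check that the probabilities sum to 1 is not affected by rounding.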

Another important characteristic, the mean value of subset 
size t, might be computed as: 
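
By linearity of expectation, one closed form for this mean is $n\left(1 - (1 - m/n)^k\right)$, since a fixed node is missed by a single agent with probability $(n-m)/n$. This identity is a standard fact that we assume here as a cross-check; it is not quoted from the derivation above. The sketch compares it with the mean obtained from the distribution, again assuming the inclusion-exclusion form of $Q(k,m,t)$; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def mean_union_size_closed_form(n, m, k):
    """n * (1 - ((n - m) / n) ** k), by linearity of expectation."""
    return n * (1 - Fraction(n - m, n) ** k)

def mean_union_size_from_distribution(n, m, k):
    """Sum of t * Pr{union size = t}, with the probability taken as
    C(n, t) * Q(k, m, t) / C(n, m) ** k (assumed form)."""
    total = Fraction(0)
    for t in range(m, min(k * m, n) + 1):
        q = sum((-1) ** i * comb(t, i) * comb(t - i, m) ** k
                for i in range(t - m + 1))
        total += t * Fraction(comb(n, t) * q, comb(n, m) ** k)
    return total

if __name__ == "__main__":
    for n, m, k in [(4, 2, 2), (10, 3, 4), (30, 5, 6)]:
        a = mean_union_size_closed_form(n, m, k)
        b = mean_union_size_from_distribution(n, m, k)
        assert a == b
        print(n, m, k, float(a))
```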

IV. On Node Repetition Limitations in an Agent Roving Scheme

Let us consider a random distribution of $m$ agents over the $n$
WSN nodes (here we consider not $k$ agents but $m$ agents, and each
individual agent visits exactly one node). Agents are dropped over
the node set one by one, independently, and with equal
probabilities for all nodes. Having allocated all $m$ agents, we
obtain a collection of nodes visited by agents, possibly with
multiple agents visiting the same node.

The total number of different allocations is $n^m$. Among these
are the 1-node allocations (all the agents visit the same node),
whose number is $n$; the 2-node allocations, whose number is
$C_n^2 (2^m - 2)$; and, largest in size, the $m$-node allocations
($m$-sets), in which all agents are placed at different nodes,
whose number is $n(n-1)\cdots(n-m+1)$. We are interested in the
frequencies of allocation sizes when at least 2 agents are
allocated at the same node (sizes from 1 to $m - 1$), or,
complementarily, in the share of allocations with all different
nodes.
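
These counts are easy to confirm by enumeration for small $n$ and $m$. The following sketch is illustrative; the function name `allocations_by_distinct_nodes` and the sample values $n = 5$, $m = 3$ are our own choices.

```python
from itertools import product
from math import comb, perm

def allocations_by_distinct_nodes(n, m):
    """Enumerate all n**m allocations of m agents to n nodes and count
    them by the number of distinct nodes that were visited."""
    counts = {}
    for allocation in product(range(n), repeat=m):
        size = len(set(allocation))
        counts[size] = counts.get(size, 0) + 1
    return counts

if __name__ == "__main__":
    n, m = 5, 3
    counts = allocations_by_distinct_nodes(n, m)
    assert sum(counts.values()) == n ** m
    assert counts[1] == n                          # all agents at one node
    assert counts[2] == comb(n, 2) * (2 ** m - 2)  # exactly two nodes used
    assert counts[m] == perm(n, m)                 # n(n-1)...(n-m+1)
    print("share with all different nodes:", counts[m] / n ** m)
```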

One of the classical approaches to determining typical cases
in distributions is to apply the Markov or Chebyshev inequality.
To this end we consider the scheme presented in Fig. 2, similar to
the one presented in Fig. 1, to compute the mean of the number of
allocated nodes in a random distribution of $m$ agents over the
$n$ WSN nodes.

Fig. 2. Agent distribution on WSN node sets. The left column contains outcomes
of $m$ trials (each $S_i$ is an ordered collection of $m$ nodes); the right column
contains triples consisting of a node and two different agents.

Thus, the number of right side vertices in the scheme, where
each vertex is a triple consisting of a node and a pair of agents,
is $n C_m^2$. An edge connects an allocation (left column) to a
node with the given pair of agents it contains (right column).
Let $v_{n,m}$ denote the number of edges incident to an allocation.
Each triple is incident to exactly $n^{m-2}$ allocations, because
the remaining $m - 2$ agents are placed freely, so we compute the
mean number $M(v_{n,m})$ of edges incident to each allocation as

$$M(v_{n,m}) = \frac{n C_m^2 \cdot n^{m-2}}{n^m} = \frac{C_m^2}{n}.$$

Apply the Markov inequality
$\Pr\{v_{n,m} \ge \varepsilon\} \le M(v_{n,m})/\varepsilon$.
Take $\varepsilon = 1$; then $C_m^2/n$ is an upper estimate of the
probability of repeated agents at nodes. If $C_m^2/n \to 0$ as
$n, m \to \infty$, then almost all allocations place all agents at
different nodes.

V. Comparison of Agent Allocation Schemes

In this section we define and consider two basic probability
distributions tightly related to each other.

• The first distribution $U_{n,k,\{m\}}$ is composed of $k$
independent consecutive allocations of $m$-node subsets over the
WSN area of $n$ nodes. The $(C_n^m)^k$ outcomes of trials are
ordered collections of $m$-subsets of WSN nodes. These
collections may cover all node subsets of sizes from $m$
to $\min(km, n)$.

• The second distribution scheme $U_{n,k,m}$, which we want
to consider and compare with the basic distribution
$U_{n,k,\{m\}}$ considered above, consists of $k$ consecutive
and independent stages; each stage allocates $m$ elements
consecutively and independently over the WSN area of
$n$ nodes. The outcomes of these trials are all $n^{km}$ ordered
collections of nodes. These collections may cover all node
subsets of sizes from 1 to $\min(km, n)$.

Fig. 3. Allocations by $U_{n,k,\{m\}}$ and $U_{n,k,m}$.

In one individual stage of $U_{n,k,m}$ we have $m!$ orderings of
a single allocation of an $m$-subset in one step of
$U_{n,k,\{m\}}$. This is to be taken into account when comparing
the schemes $U_{n,k,\{m\}}$ and $U_{n,k,m}$. The difference can
also be seen by comparing the one-stage outcomes of
$U_{n,k,\{m\}}$ and $U_{n,k,m}$. Represent $C_n^m$ of model
$U_{n,k,\{m\}}$ as

$$C_n^m = \frac{n!}{m!\,(n-m)!} = \frac{n(n-1)\cdots(n-m+1)}{m!}.$$

The numerator of the last ratio is the counterpart, in model
$U_{n,k,\{m\}}$, of the count $n^m$, and $m!$ is the coefficient
mentioned above.
Comparing $U_{n,k,\{m\}}$ and $U_{n,k,m}$, we first note that the
outcomes of $U_{n,k,\{m\}}$ are part of the outcomes of $U_{n,k,m}$
and hence they may have higher probabilities. Consider the
probability $p_j$ of the event that in stage $j$ of $U_{n,k,m}$ all
the $m$ allocated elements are different. Then
$P = p_1 \cdot p_2 \cdots p_k$ is the probability that in all $k$
stages the allocated elements are different. Allocations in
different stages may of course intersect. The outcome probabilities
of $U_{n,k,\{m\}}$ multiplied by this probability are equal to the
probabilities of $U_{n,k,m}$ on part B of the intersection of
outcomes (Fig. 3). $p_j$ was estimated in the previous section as a
value tending to 1 asymptotically. We may extend this proposition
to the entire value $P$. Formally we use the property that the
probability of a union of events is less than or equal to the sum
of the event probabilities:

$$\Pr\{(v_{n,m} \ge \varepsilon \mid q = 1) \vee \dots \vee (v_{n,m} \ge \varepsilon \mid q = k)\} \le k \cdot \Pr\{v_{n,m} \ge \varepsilon\} \le k \cdot \frac{M(v_{n,m})}{\varepsilon}.$$
Then the final condition (upper estimate) sufficient for the
repetition probability to tend to zero is $k C_m^2 / n \to 0$ as
$n, m, k \to \infty$. The sufficient condition for allocation of
all $m$ agents in all $k$ consecutive stages to different nodes,
$k m^2 / n \to 0$, is naturally acceptable in WSN, which as a rule
have a very large node set. The final picture is: part B
allocations (Fig. 3) appear in $U_{n,k,m}$ with probability $P$
tending to 1; the relative probability distribution among the
elements of B is identical in $U_{n,k,\{m\}}$ and $U_{n,k,m}$; an
event probability in model $U_{n,k,\{m\}}$ is not less than that in
$U_{n,k,m}$ multiplied by $P$; probabilities of $t$-node
allocations in $U_{n,k,m}$ are similar to the ones for model
$U_{n,k,\{m\}}$ considered above.
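
The Markov and union bounds used here can be compared with the exact repetition probabilities. The sketch below is a minimal illustration under the model assumptions of this section (independent uniform drops, $k$ independent stages); the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction
from math import comb, perm

def prob_repeat_single_stage(n, m):
    """Exact probability that m independent uniform drops onto n nodes
    place at least two agents at the same node."""
    return 1 - Fraction(perm(n, m), n ** m)

def markov_bound_single_stage(n, m):
    """Upper estimate C(m, 2) / n from the Markov inequality with eps = 1."""
    return Fraction(comb(m, 2), n)

def prob_repeat_any_stage(n, m, k):
    """Exact probability of a repetition in at least one of k stages."""
    return 1 - Fraction(perm(n, m), n ** m) ** k

def union_bound(n, m, k):
    """Upper estimate k * C(m, 2) / n from the union bound."""
    return k * Fraction(comb(m, 2), n)

if __name__ == "__main__":
    for n, m, k in [(1000, 5, 3), (10**5, 10, 20), (10**6, 20, 50)]:
        exact = prob_repeat_any_stage(n, m, k)
        bound = union_bound(n, m, k)
        assert exact <= bound
        print(n, m, k, float(exact), float(bound))
```

The bound is informative only when $k C_m^2 / n$ is well below 1; for larger values it exceeds 1 and says nothing.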



If $R(k,m,t)$ denotes the number of $t$-node allocations in
model $U_{n,k,m}$, then a formal representation of $R(k,m,t)$,
similar to the formula for $Q(k,m,t)$ considered above, can
be obtained by the same inclusion-exclusion method:

$$R(k,m,t) = t^{mk} - C_t^1 \cdot (t-1)^{mk} + C_t^2 \cdot (t-2)^{mk} - \dots + (-1)^{t-1} C_t^{t-1} \cdot (t-(t-1))^{mk} = \sum_{i=0}^{t-1} (-1)^i C_t^i \cdot (t-i)^{mk}. \qquad (10)$$
On this basis we formulate

Theorem 2. If $k C_m^2 / n \to 0$ as $n, m, k \to \infty$, then
the probabilities of $t$-node allocations in the models
$U_{n,k,\{m\}}$ and $U_{n,k,m}$ are related by

$$\frac{Q(k,m,t)}{(C_n^m)^k} \cdot P \le \frac{R(k,m,t)}{n^{mk}}, \qquad P \to 1. \qquad (11)$$
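
To see how close the two models are for concrete parameters, the $t$-node probabilities can be computed exactly in both. The sketch below rests on several assumptions of ours: the inclusion-exclusion form of $Q(k,m,t)$, formula (10) for $R(k,m,t)$, the probabilities $C_n^t Q / (C_n^m)^k$ and $C_n^t R / n^{mk}$, and $P = \left(n(n-1)\cdots(n-m+1)/n^m\right)^k$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, perm

def q_count(k, m, t):
    """Assumed inclusion-exclusion form of Q(k, m, t)."""
    return sum((-1) ** i * comb(t, i) * comb(t - i, m) ** k
               for i in range(t - m + 1))

def r_count(k, m, t):
    """R(k, m, t) as in formula (10)."""
    return sum((-1) ** i * comb(t, i) * (t - i) ** (m * k) for i in range(t))

def prob_subset_model(n, m, k, t):
    """Probability of covering exactly t nodes in U_{n,k,{m}}."""
    return Fraction(comb(n, t) * q_count(k, m, t), comb(n, m) ** k)

def prob_sequence_model(n, m, k, t):
    """Probability of covering exactly t nodes in U_{n,k,m}."""
    return Fraction(comb(n, t) * r_count(k, m, t), n ** (m * k))

def prob_all_distinct(n, m, k):
    """P: in every one of the k stages all m elements are different."""
    return Fraction(perm(n, m), n ** m) ** k

if __name__ == "__main__":
    n, m, k = 200, 3, 4
    p = prob_all_distinct(n, m, k)
    print("P =", float(p))
    for t in range(1, min(k * m, n) + 1):
        a = prob_subset_model(n, m, k, t)
        b = prob_sequence_model(n, m, k, t)
        assert a * p <= b  # subset-model probability scaled by P is a lower bound
        print(t, float(a), float(b))
```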

Finally, we note that $R(k,m,t)$ has an equivalent presentation
in terms of Stirling numbers of the second kind ([2]):

$$S(N,K) = \frac{1}{K!} \sum_{i=0}^{K} (-1)^i C_K^i (K-i)^N. \qquad (12)$$

Here we used the fact that the allocation of $k$ consecutive and
independent stages of $m$ elements over the WSN area of $n$
nodes is equivalent to the allocation of $km$ elements over that
area. Note a difference between the formulas for $Q(k,m,t)$
and $R(k,m,t)$, namely the summation limits. In the case of
$R(k,m,t)$ we may formally add the zero term for $i = t$, and then
we obtain

$$R(k,m,t) = t!\, S(mk,t), \qquad (13)$$

which is the final statement of this paper.
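
Relation (13) can be checked numerically. The following sketch computes $R(k,m,t)$ from the inclusion-exclusion sum (10) and compares it with $t!\,S(mk,t)$, where the Stirling numbers are obtained from the standard recurrence $S(N,K) = K \cdot S(N-1,K) + S(N-1,K-1)$. The function names are our own.

```python
from math import comb, factorial

def r_count(k, m, t):
    """R(k, m, t) by the inclusion-exclusion sum (10)."""
    return sum((-1) ** i * comb(t, i) * (t - i) ** (m * k) for i in range(t))

def stirling2(N, K):
    """Stirling number of the second kind S(N, K), computed with the
    recurrence S(N, K) = K * S(N-1, K) + S(N-1, K-1), S(0, 0) = 1."""
    row = [1] + [0] * K          # row[j] holds S(0, j)
    for _ in range(N):
        new = [0] * (K + 1)
        for j in range(1, K + 1):
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[K]

if __name__ == "__main__":
    for k in range(1, 4):
        for m in range(1, 4):
            for t in range(1, k * m + 1):
                assert r_count(k, m, t) == factorial(t) * stirling2(m * k, t)
    print(r_count(2, 2, 2), factorial(2) * stirling2(4, 2))
```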

VI. Conclusion 

WSN and software agent systems are important application
techniques in many areas. Being algorithmically hard and
complex at the model level, these systems require economical
operating regimes, which calls for knowing the minimal
requirements and the maximum effect attainable when resources are
limited. For the randomly roving agents model considered above, it
is shown that the probabilities that arise are equivalently
presented in terms of combinatorial Stirling numbers, and, due to
the known asymptotic formulas for these numbers ([2]), this allows
the monitoring regime to be set up in an optimal way.